Study Plan

The third part of this nanodegree program covers policy-based methods in deep reinforcement learning. You can find all of the coding exercises from the lessons in this GitHub repository.

## Lessons

Lesson: Introduction to Policy-Based Methods

In this lesson, you will learn about methods such as hill climbing, simulated annealing, and adaptive noise scaling. You'll also learn about cross-entropy methods and evolution strategies.

Lesson: Policy Gradient Methods

In this lesson, you'll study REINFORCE, along with improvements we can make to lower the variance of policy gradient algorithms.

Lesson: Proximal Policy Optimization

In this lesson, you'll learn about Proximal Policy Optimization (PPO), a cutting-edge policy gradient method.

Lesson: Actor-Critic Methods

In this lesson, you'll learn how to combine value-based and policy-based methods, bringing together the best of both worlds, to solve challenging reinforcement learning problems.

Lesson: Deep RL for Finance (Optional)

In this optional lesson, you'll learn how to apply deep reinforcement learning techniques for optimal execution of portfolio transactions.

Resources (Optional)

Read the most famous blog post on policy gradient methods.
Implement a policy gradient method to win at Pong in this Medium post.
Learn more about evolution strategies from OpenAI.